Trajectory Deformations from Physical Human-Robot Interaction
Robots are finding new applications where physical interaction with a human
is necessary: manufacturing, healthcare, and social tasks. Accordingly, the
field of physical human-robot interaction (pHRI) has leveraged impedance
control approaches, which support compliant interactions between human and
robot. However, a limitation of traditional impedance control is that---despite
provisions for the human to modify the robot's current trajectory---the human
cannot affect the robot's future desired trajectory through pHRI. In this
paper, we present an algorithm for physically interactive trajectory
deformations which, when combined with impedance control, allows the human to
modulate both the actual and desired trajectories of the robot. Unlike related
works, our method explicitly deforms the future desired trajectory based on
forces applied during pHRI, but does not require constant human guidance. We
present our approach and verify that this method is compatible with traditional
impedance control. Next, we use constrained optimization to derive the
deformation shape. Finally, we describe an algorithm for real time
implementation, and perform simulations to test the arbitration parameters.
Experimental results demonstrate a reduction in the human's effort and an
improvement in movement quality when compared to pHRI with impedance control
alone.
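To make this concrete, below is a minimal Python sketch of a physically
interactive trajectory deformation, assuming the smooth deformation shape is
built from a finite-difference jerk penalty with zero deformation at the
segment boundaries; the arbitration gain mu, the horizon length, and the
function names are illustrative stand-ins, not the paper's exact formulation.

import numpy as np

def deformation_shape(n_waypoints):
    """Smooth shape that distributes a single applied force over the next
    n waypoints. Illustrative construction: penalize the finite-difference
    jerk of the deformation, with zero deformation at the boundaries."""
    A = np.zeros((n_waypoints + 3, n_waypoints))
    stencil = np.array([1.0, -3.0, 3.0, -1.0])  # third-difference (jerk)
    for i in range(n_waypoints):
        A[i:i + 4, i] += stencil
    shape = np.linalg.inv(A.T @ A) @ np.ones(n_waypoints)
    return shape / np.linalg.norm(shape)

def deform_trajectory(desired, t, human_force, mu=0.1, horizon=20):
    """Shift the robot's future desired waypoints in response to a sensed
    human force. desired: (T, dof) waypoints, t: current index,
    human_force: (dof,) force, mu: arbitration gain between human and robot."""
    deformed = desired.copy()
    n = min(horizon, len(desired) - t)
    if n <= 0:
        return deformed
    deformed[t:t + n] += mu * np.outer(deformation_shape(n), human_force)
    return deformed

The impedance controller would then track the deformed desired trajectory,
so the human's applied force changes where the robot plans to go, not just
where it currently is.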
Learning Latent Representations to Co-Adapt to Humans
When robots interact with humans in homes, on roads, or in factories, the
human's behavior often changes in response to the robot. Non-stationary
humans are
challenging for robot learners: actions the robot has learned to coordinate
with the original human may fail after the human adapts to the robot. In this
paper we introduce an algorithmic formalism that enables robots (i.e., ego
agents) to co-adapt alongside dynamic humans (i.e., other agents) using only
the robot's low-level states, actions, and rewards. A core challenge is that
humans not only react to the robot's behavior, but the way in which humans
react inevitably changes both over time and between users. To deal with this
challenge, our insight is that -- instead of building an exact model of the
human -- robots can learn and reason over high-level representations of the
human's policy and policy dynamics. Applying this insight we develop RILI:
Robustly Influencing Latent Intent. RILI first embeds low-level robot
observations into predictions of the human's latent strategy and strategy
dynamics. Next, RILI harnesses these predictions to select actions that
influence the adaptive human towards advantageous, high reward behaviors over
repeated interactions. We demonstrate that -- given RILI's measured performance
with users sampled from an underlying distribution -- we can probabilistically
bound RILI's expected performance across new humans sampled from the same
distribution. Our simulated experiments compare RILI to state-of-the-art
representation and reinforcement learning baselines, and show that RILI better
learns to coordinate with imperfect, noisy, and time-varying agents. Finally,
we conduct two user studies where RILI co-adapts alongside actual humans in a
game of tag and a tower-building task. See videos of our user studies here:
https://youtu.be/WYGO5amDXb
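As a rough illustration of this pipeline, the sketch below pairs an encoder
that embeds the previous interaction into a latent strategy (and predicts how
that strategy will shift) with a policy conditioned on the prediction. The
architecture, latent size, and names here are assumptions made for the sake
of example, not RILI's published implementation.

import torch
import torch.nn as nn

class StrategyEncoder(nn.Module):
    """Embed the last interaction (low-level states, actions, rewards) into a
    latent strategy z and predict the strategy the human will use next."""
    def __init__(self, interaction_dim, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(interaction_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim))
        # Strategy dynamics: how the human's strategy changes after observing
        # the robot's latest behavior.
        self.dynamics = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim))

    def forward(self, interaction):
        z = self.encoder(interaction)   # current latent strategy
        z_next = self.dynamics(z)       # predicted next strategy
        return z, z_next

class LatentConditionedPolicy(nn.Module):
    """Robot policy that acts on its state plus the predicted next strategy,
    so actions can steer where that strategy moves over repeated rounds."""
    def __init__(self, state_dim, latent_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh())

    def forward(self, state, z_next):
        return self.net(torch.cat([state, z_next], dim=-1))

Training details (the prediction losses for the encoder and the reinforcement
learning objective for the policy) are omitted from this sketch.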
LIMIT: Learning Interfaces to Maximize Information Transfer
Robots can use auditory, visual, or haptic interfaces to convey information
to human users. The way these interfaces select signals is typically
pre-defined by the designer: for instance, a haptic wristband might vibrate
when the robot is moving and squeeze when the robot stops. But different people
interpret the same signals in different ways, so that what makes sense to one
person might be confusing or unintuitive to another. In this paper we introduce
a unified algorithmic formalism for learning co-adaptive interfaces from
scratch. Our method does not need to know the human's task (i.e., what the
human is using these signals for). Instead, our insight is that interpretable
interfaces should select signals that maximize correlation between the human's
actions and the information the interface is trying to convey. Applying this
insight we develop LIMIT: Learning Interfaces to Maximize Information Transfer.
LIMIT optimizes a tractable, real-time proxy of information gain in continuous
spaces. The first time a person works with our system, the signals may appear
random, but over repeated interactions the interface learns a one-to-one
mapping between displayed signals and human responses. Our resulting approach
is both personalized to the current user and not tied to any specific interface
modality. We compare LIMIT to state-of-the-art baselines across controlled
simulations, an online survey, and an in-person user study with auditory,
visual, and haptic interfaces. Overall, our results suggest that LIMIT learns
interfaces that enable users to complete the task more quickly and efficiently,
and users subjectively prefer LIMIT to the alternatives. See videos here:
https://youtu.be/IvQ3TM1_2fA
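Below is a minimal sketch of this correlation idea, assuming a 1-D interface,
a small set of candidate signal mappings, and a stand-in model of the human's
response (in practice that response model would come from logged
interactions); none of the names or constants below are taken from the paper.

import numpy as np

def correlation_proxy(info, responses):
    """Proxy for information transfer: |correlation| between the information
    the interface tried to convey and the human's responses. Either sign of
    correlation gives a consistent, learnable mapping."""
    if np.std(info) == 0 or np.std(responses) == 0:
        return 0.0
    return abs(np.corrcoef(info, responses)[0, 1])

def select_signal_map(candidate_maps, info_samples, human_model):
    """Pick the candidate mapping info -> signal whose signals make the
    predicted human responses co-vary most with the conveyed information."""
    scores = []
    for signal_map in candidate_maps:
        signals = np.array([signal_map(x) for x in info_samples])
        responses = np.array([human_model(s) for s in signals])
        scores.append(correlation_proxy(info_samples, responses))
    return candidate_maps[int(np.argmax(scores))]

# Illustrative usage: a synthetic "human" who nudges a slider toward whatever
# signal they see, plus noise. The constant mapping scores near zero, while a
# one-to-one mapping (identity or its negation) maximizes the proxy.
rng = np.random.default_rng(0)
human = lambda s: s + 0.1 * rng.normal()
candidates = [lambda x: 0.0 * x, lambda x: x, lambda x: -x]
info = rng.uniform(-1.0, 1.0, size=50)
best_map = select_signal_map(candidates, info, human)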
Should Collaborative Robots be Transparent?
We often assume that robots collaborating with humans should behave in ways
that are transparent (e.g., legible, explainable). These transparent
robots intentionally choose actions that convey their internal state to nearby
humans: for instance, a transparent robot might exaggerate its trajectory to
indicate its goal. But while transparent behavior seems beneficial for
human-robot interaction, is it actually optimal? In this paper we consider
collaborative settings where the human and robot have the same objective, and
the human is uncertain about the robot's type (i.e., the robot's internal
state). We extend a recursive combination of Bayesian Nash equilibrium and the
Bellman equation to solve for optimal robot policies. Interestingly, we
discover that it is not always optimal for collaborative robots to be
transparent; instead, human and robot teams can sometimes achieve higher
rewards when the robot is opaque. In contrast to transparent robots, opaque
robots select actions that withhold information from the human. Our analysis
suggests that opaque behavior becomes optimal when either (a) human-robot
interactions have a short time horizon or (b) users are slow to learn from the
robot's actions. We extend this theoretical analysis to user studies across 43
total participants in both online and in-person settings. We find that --
during short interactions -- users reach higher rewards when working with
opaque partners, and subjectively rate opaque robots as about equal to
transparent robots. See videos of our experiments here:
https://youtu.be/u8q1Z7WHUu
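The trade-off behind this result can be seen in a toy two-type model: each
round the robot either signals its type (a legible but slightly costly
action) or acts ambiguously, and the team reward grows with the human's
confidence in the true type. The constants below are invented for
illustration, and the belief update is a simplified stand-in for the
recursive Bayesian reasoning in the paper.

SIGNAL_COST = 0.3  # immediate reward given up by an exaggerated, legible action
TEAM_GAIN = 1.0    # per-round reward scale when the team coordinates well
LEARN_RATE = 0.4   # how far the human's belief moves after a transparent action

def belief_after_signal(b):
    """Human belief in the robot's true type after a transparent action."""
    return b + LEARN_RATE * (1.0 - b)

def value(b, horizon):
    """Best team value from belief b with `horizon` rounds left, plus the
    optimal first action (transparent vs. opaque)."""
    if horizon == 0:
        return 0.0, None
    coord_reward = TEAM_GAIN * b  # the team does better when the belief is right
    opaque = coord_reward + value(b, horizon - 1)[0]
    transparent = (coord_reward - SIGNAL_COST
                   + value(belief_after_signal(b), horizon - 1)[0])
    if transparent > opaque:
        return transparent, "transparent"
    return opaque, "opaque"

for T in (1, 2, 5, 10):
    v, first = value(0.5, T)
    print(f"horizon={T:2d}  value={v:5.2f}  best first action: {first}")

With these made-up numbers, opaque behavior wins at short horizons because
the signaling cost cannot be amortized, while transparent behavior becomes
optimal once enough future rounds remain, mirroring condition (a) above.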